82 research outputs found

    The functional spectrum of low-frequency coding variation

    Get PDF
    Background Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency. Results The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants. Conclusions This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variatio

    Platypus globin genes and flanking loci suggest a new insertional model for beta-globin evolution in birds and mammals

    Get PDF
    Background: Vertebrate alpha (α)- and beta (β)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the α- and β-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil β-globin gene (ω) in the marsupial α-cluster, however, suggested that duplication of the α-β cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous α- and β-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. Results: The platypus α-globin cluster (chromosome 21) contains embryonic and adult α- globin genes, a β-like ω-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-ζ-ζ'-αD-α3-α2-α1-ω-GBY-3'. The platypus β-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-ε-β-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate α-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal β-globin clusters are embedded in olfactory genes. Thus, the mammalian α- and β-globin clusters are orthologous to the bird α- and β-globin clusters respectively. Conclusion: We propose that α- and β-globin clusters evolved from an ancient MPG-C16orf35-α-β-GBY-LUC7L arrangement 410 million years ago. A copy of the original β (represented by ω in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of β-globin genes with different expression profiles in different lineages.Vidushi S. Patel, Steven J.B. Cooper, Janine E. Deakin, Bob Fulton, Tina Graves, Wesley C. Warren, Richard K. Wilson and Jennifer A.M. Grave

    A second planet transiting LTT 1445A and a determination of the masses of both worlds

    Get PDF
    K.H. acknowledges support from STFC grant ST/R000824/1.LTT 1445 is a hierarchical triple M-dwarf star system located at a distance of 6.86 pc. The primary star LTT 1445A (0.257 M⊙) is known to host the transiting planet LTT 1445Ab with an orbital period of 5.36 days, making it the second-closest known transiting exoplanet system, and the closest one for which the host is an M dwarf. Using Transiting Exoplanet Survey Satellite data, we present the discovery of a second planet in the LTT 1445 system, with an orbital period of 3.12 days. We combine radial-velocity measurements obtained from the five spectrographs, Echelle Spectrograph for Rocky Exoplanets and Stable Spectroscopic Observations, High Accuracy Radial Velocity Planet Searcher, High-Resolution Echelle Spectrometer, MAROON-X, and Planet Finder Spectrograph to establish that the new world also orbits LTT 1445A. We determine the mass and radius of LTT 1445Ab to be 2.87 ± 0.25 M⊕ and 1.304-0.060+0.067 R⊕, consistent with an Earth-like composition. For the newly discovered LTT 1445Ac, we measure a mass of 1.54-0.19+0.20 M⊕ and a minimum radius of 1.15 R⊕, but we cannot determine the radius directly as the signal-to-noise ratio of our light curve permits both grazing and nongrazing configurations. Using MEarth photometry and ground-based spectroscopy, we establish that star C (0.161 M⊙) is likely the source of the 1.4 day rotation period, and star B (0.215 M⊙) has a likely rotation period of 6.7 days. We estimate a probable rotation period of 85 days for LTT 1445A. Thus, this triple M-dwarf system appears to be in a special evolutionary stage where the most massive M dwarf has spun down, the intermediate mass M dwarf is in the process of spinning down, while the least massive stellar component has not yet begun to spin down.Publisher PDFPeer reviewe

    New insights into the genetic etiology of Alzheimer's disease and related dementias

    Get PDF
    Characterization of the genetic landscape of Alzheimer's disease (AD) and related dementias (ADD) provides a unique opportunity for a better understanding of the associated pathophysiological processes. We performed a two-stage genome-wide association study totaling 111,326 clinically diagnosed/'proxy' AD cases and 677,663 controls. We found 75 risk loci, of which 42 were new at the time of analysis. Pathway enrichment analyses confirmed the involvement of amyloid/tau pathways and highlighted microglia implication. Gene prioritization in the new loci identified 31 genes that were suggestive of new genetically associated processes, including the tumor necrosis factor alpha pathway through the linear ubiquitin chain assembly complex. We also built a new genetic risk score associated with the risk of future AD/dementia or progression from mild cognitive impairment to AD/dementia. The improvement in prediction led to a 1.6- to 1.9-fold increase in AD risk from the lowest to the highest decile, in addition to effects of age and the APOE ε4 allele

    Finishing the euchromatic sequence of the human genome

    Get PDF
    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human enome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead

    Detection and characterization of small insertion and deletion genetic variants in modern layer chicken genomes

    Get PDF
    Background: Small insertions and deletions (InDels) constitute the second most abundant class of genetic variants and have been found to be associated with many traits and diseases. The present study reports on the detection and characterisation of about 883 K high quality InDels from the whole-genome analysis of several modern layer chicken lines from diverse breeds. Results: To reduce the error rates seen in InDel detection, this study used the consensus set from two InDel-calling packages: SAMtools and Dindel, as well as stringent post-filtering criteria. By analysing sequence data from 163 chickens from 11 commercial and 5 experimental layer lines, this study detected about 883 K high quality consensus InDels with 93 % validation rate and an average density of 0.78 InDels/kb over the genome. Certain chromosomes, viz, GGAZ, 16, 22 and 25 showed very low densities of InDels whereas the highest rate was observed on GGA6. In spite of the higher recombination rates on microchromosomes, the InDel density on these chromosomes was generally lower relative to macrochromosomes possibly due to their higher gene density. About 43-87 % of the InDels were found to be fixed within each line. The majority of detected InDels (86 %) were 1-5 bases and about 63 % were non-repetitive in nature while the rest were tandem repeats of various motif types. Functional annotation identified 613 frameshift, 465 non-frameshift and 10 stop-gain/loss InDels. Apart from the frameshift and stopgain/loss InDels that are expected to affect the translation of protein sequences and their biological activity, 33 % of the non-frameshift were predicted as evolutionary intolerant with potential impact on protein functions. Moreover, about 2.5 % of the InDels coincided with the most-conserved elements previously mapped on the chicken genome and are likely to define functional elements. InDels potentially affecting protein function were found to be enriched for certain gene-classes e.g. those associated with cell proliferation, chromosome and Golgi organization, spermatogenesis, and muscle contraction. Conclusions: The large catalogue of InDels presented in this study along with their associated information such as functional annotation, estimated allele frequency, etc. are expected to serve as a rich resource for application in future research and breeding in the chicken
    corecore